In [1]:
%load_ext autoreload
%autoreload 2

import plot
import util
import itertools

import embed_helpers
import preprocessing_helpers
import plot_helpers
import cluster_helpers
import same_size_kmeans
import data_loading
import nlp_helpers

import warnings
import requests
import operator

from functools import reduce 

import networkx as nx
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import holoviews as hv
hv.extension('bokeh')
import pickle

import seaborn as sns 
sns.set()

Make sure that you read this notebook on the website:

Website

otherwise you might not see the interactive plots!

Data Cleaning

We implemented data cleaning in the file src/data_cleaning.py. The encountered issues are explained in src/data_cleaning.ipynb.

Load Data

In [2]:
votes = data_loading.votes()
votes.head(1)

The votes dataframe says which councillor voted what for every vote in every legislature.

In [3]:
members = data_loading.members()
members.head(1)

The members dataframe contains background information on all members of the three Swiss councils in the period from 2007 to 2019.

In [4]:
full_votes = data_loading.full_votes()
with pd.option_context("display.max_columns", None):
    display(full_votes.head(1))

The full_votes dataframe contains all votes in the national council between 2007 and 2019. It was joined with the members dataframe and therefore also contains all background information for the respective councillors at that time. Note that the background information on a councillor changes over time as they might, for example, change party. The dataframe contains information for about 13500 votes (13500 * 200 rows in total, as the national council has 200 members).

In [5]:
ncm_raw = util.split_legislative(data_loading.national_council_members())
legislature_raw = data_loading.votes_legislative()

ncm_raw and legislature_raw contain the member and vote information split up by legislature.

Preprocessing

To preprocess the votes data, we remove politicians who took part in fewer than half of the votes of a legislature, either because they weren't part of the parliament at the time or because they were busy doing more important things than what they were elected to do.

In [6]:
leg, ncm = list(zip(*[preprocessing_helpers.filter_impute(l, n) for l,n in zip(legislature_raw, ncm_raw)]))
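
The filtering step can be illustrated with a minimal sketch. It is not the code of preprocessing_helpers.filter_impute: the function name filter_and_impute, the NaN encoding of absences and the imputation by the mean of each vote are all assumptions made for illustration.

```python
import numpy as np

def filter_and_impute(votes, min_presence=0.5):
    """Hypothetical stand-in for preprocessing_helpers.filter_impute.

    votes: 2-D float array, rows = votes, columns = councillors, NaN = absent.
    """
    votes = np.asarray(votes, dtype=float)
    # Share of votes in which each councillor actually took part
    presence = 1.0 - np.isnan(votes).mean(axis=0)
    keep = presence >= min_presence
    kept = votes[:, keep].copy()
    # Assumption: fill the remaining gaps with the mean of that vote
    # over the councillors who were present
    row_mean = np.nanmean(kept, axis=1, keepdims=True)
    gaps = np.isnan(kept)
    kept[gaps] = np.broadcast_to(row_mean, kept.shape)[gaps]
    return kept, keep

demo = np.array([[1.0, np.nan, 0.0],
                 [1.0, np.nan, 1.0],
                 [0.0, 1.0, np.nan],
                 [1.0, np.nan, 1.0]])
clean, keep = filter_and_impute(demo)
print(keep)   # the second councillor took part in only 1 of 4 votes
print(clean)
```

Whatever imputation the real helper uses, the point is that the matrix passed on to PCA no longer contains missing values.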

Representing Politicians

PCA

We start off by reducing the politicians' votes to their principal components, applying principal component analysis (PCA) separately to every legislature.

In [7]:
pc, scalefit, pcafit = zip(*[embed_helpers.do_pca(x.T) for x in leg])
In [8]:
def explained_variance_plot(pcafit):
    expl_var = np.cumsum(pcafit.explained_variance_ratio_)
    plt.plot(expl_var)
    plt.ylabel("Explained variance")
    plt.xlabel("Number Principal Components")
        
plot_helpers.by_legislature(explained_variance_plot, pcafit)

In every legislature, very few principal components already explain 75% of the variance in the data. We thus retain only those principal components for the subsequent analysis.
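
To make the 75% criterion concrete, here is a small sketch of how the number of retained components could be read off the cumulative explained variance. It uses a plain NumPy SVD on synthetic data rather than embed_helpers.do_pca, and it skips the standardization that the real helper may apply; the function name n_components_for is hypothetical.

```python
import numpy as np

def n_components_for(X, target=0.75):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target` (hypothetical helper)."""
    Xc = X - X.mean(axis=0)                      # centre the columns
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values
    cum = np.cumsum(s**2 / np.sum(s**2))         # cumulative variance ratio
    return int(np.searchsorted(cum, target)) + 1, cum

# Synthetic stand-in for a politicians x votes matrix with two latent axes
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 50))
X = latent @ loadings + 0.05 * rng.normal(size=(200, 50))

n, cum = n_components_for(X, target=0.75)
print(n, cum[:3])
```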

Vote Importance Score

From the PCA, we can construct an importance score for every individual vote: We do so by determining how much each vote contributes to the most important principal components. A high score means that a vote is very important for determining the politician's general location in the reduced space. We can thus validate the quality of our PCA by looking at the most and least important votes and determining whether they make sense from a political-science point of view.

In [9]:
def vote_importance_score(pc, pcafit):
    """
    Determines vote importance as the summed contribution of the vote to the retained 
    principal components.
    """
    nb_retain = pc.shape[1]
    return np.abs(pcafit.components_[:nb_retain,:]).sum(axis=0)

vote_importance = [vote_importance_score(x, y) for x,y in zip(pc, pcafit)]
In [10]:
plot_helpers.by_legislature(lambda x: plt.hist(x, bins=20), vote_importance)
In [11]:
def make_vote_importance_df(lix, l, vi):
    """Creates dataframe with VoteId, affair title and vote importance score,
    for a given legislature with index lix, data l and vote importance vi.
    Uses global variable full_votes."""
    
    vi_pd = pd.DataFrame({'Vote Importance': vi}, index=l.index)
    # One row per vote, restricted to the legislature of interest
    fv_unique = full_votes.drop_duplicates(['VoteId'])
    fv_red = fv_unique.loc[fv_unique['Legislative'] == lix, ['AffairTitle', 'VoteId']]
    return fv_red.join(vi_pd, on='VoteId', how='left')

# The function has its own name so that re-running this cell does not try to call the list
vote_importance_df = [make_vote_importance_df(lix, l, vi) 
                      for lix, (l, vi) 
                      in enumerate(zip(leg, vote_importance))]
In [12]:
def importance_extremes(df, n, which="most"):
    """Prints n most or least important votes"""
    if which=="most":
        subs = df.nlargest(n, "Vote Importance")
    else:
        subs = df.nsmallest(n, "Vote Importance")
    for title in subs["AffairTitle"]:
        print(title)
In [13]:
for i, df in enumerate(vote_importance_df):
    print("\n\n")
    print("LEGISLATURE", i+1)
    print("-------------")
    print("MOST IMPORTANT")
    importance_extremes(df, 5, "most")
    print("\nLEAST IMPORTANT")
    importance_extremes(df, 5, "least")


LEGISLATURE 1
-------------
MOST IMPORTANT
Souveraineté alimentaire et denrées alimentaires de base
Rupture des négociations sur un accord de libre-échange agroalimentaire avec l'UE
Mesures organisationnelles dans le domaine de l'asile
Cybercriminalité. Combler les lacunes du droit pénal
Accord de libre-échange dans le secteur agroalimentaire. Suspendre les négociations avec l'UE

LEAST IMPORTANT
Péages du tunnel du Grand-Saint-Bernard. Non-assujettissement à la taxe sur la valeur ajoutée
Prorogation de la loi fédérale sur l'adaptation des participations cantonales aux coûts des traitements hospitaliers dispensés dans le canton
Convention de la Haye sur la protection des enfants. Enlèvements d'enfants
Budget 2008
Budget 2008



LEGISLATURE 2
-------------
MOST IMPORTANT
Rendre les exportateurs moins dépendants du dollar grâce à un accord monétaire avec la Chine
Mise en oeuvre des recommandations soumises par la CdG-CN en matière de procédures de consultation
Loi sur les produits thérapeutiques. Révision
Stratégie énergétique 2050, premier volet. Pour la sortie programmée de l’énergie nucléaire (Initiative Sortir du nucléaire). Initiative populaire
Loi sur les professions médicales (LPMéd). Modification

LEAST IMPORTANT
Financement transitoire pour les associations faîtières du domaine de la formation continue
Loi sur les épizooties. Modification
Loi sur les épizooties. Modification
Circulation des espèces de faune et de flore protégées. Loi
Budget 2012



LEGISLATURE 3
-------------
MOST IMPORTANT
Budget 2017 assorti du plan intégré des tâches et des finances 2018-2020
Budget 2017 assorti du plan intégré des tâches et des finances 2018-2020
Budget 2018 assorti du plan intégré des tâches et des finances 2019-2021
Développement de l’acquis de Schengen. Création du Fonds pour la sécurité intérieure
Développement de l’acquis de Schengen. Reprise du règlement portant création d’une agence pour des systèmes d‘information

LEAST IMPORTANT
Loi fédérale sur l'impôt anticipé. Modification
Loi autorisant l’approbation d’amendements à l’AETR. Modification
Assurance contre les dommages dus à des événements naturels exploitée par des entreprises d’assurance privées. Accord avec la Principauté de Liechtenstein
Budget 2016
Budget 2016



We see that the score makes sense: laws on immigration, trade agreements, cybercrime and nuclear energy are scored as highly important, while laws on animal diseases, tolls for a tunnel and insurance harmonization with Liechtenstein are scored as unimportant. Note that for some votes, such as those on the budget, it is hard to judge the importance: as there are many votes on the budget, some are controversial while others aren't.

Graph Representation

A second way to represent the politicians is in the form of an undirected graph. The graph is constructed such that every politician is connected to their k nearest neighbors in the PCA space. For technical reasons, we want the resulting graph to be connected. Moreover, we set the weight of the edge between two politicians $p_i$ and $p_j$ to: $\quad w_{ij} = \exp \{ -\frac{1}{2\sigma^2} \|p_i - p_j\|_2^2 \}$
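
As an illustration of this construction, the following sketch builds a symmetric k-nearest-neighbor graph with Gaussian edge weights using NumPy only. It assumes the usual Gaussian kernel with a negative exponent, picks the median pairwise distance as $\sigma$ and symmetrizes by keeping an edge if either endpoint selected the other. All three choices, and the function name knn_gaussian_graph, are assumptions and not necessarily what embed_helpers.get_knn_graph does.

```python
import numpy as np

def knn_gaussian_graph(points, k=10, sigma=None):
    """Weighted adjacency matrix of a symmetric kNN graph (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    if sigma is None:
        # Assumption: use the median pairwise distance as bandwidth
        sigma = np.sqrt(np.median(sq_dist[np.triu_indices(n, k=1)]))
    weights = np.exp(-sq_dist / (2 * sigma**2))
    adjacency = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(sq_dist[i])
        neighbours = [j for j in order if j != i][:k]
        adjacency[i, neighbours] = weights[i, neighbours]
    # Keep an edge if either endpoint chose the other as a neighbour
    return np.maximum(adjacency, adjacency.T)

rng = np.random.default_rng(1)
demo_points = rng.normal(size=(30, 3))
A = knn_gaussian_graph(demo_points, k=5)
print(A.shape, (A > 0).sum(axis=1).min())
```

Because of the symmetrization, a node can end up with more than k neighbors, but never with fewer.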

In [ ]:
knn_graph = [embed_helpers.get_knn_graph(x, k=10) for x in pc]
knn_graph_plot_pos = [nx.spring_layout(g, iterations=100, seed=42, k=3/np.sqrt(len(g))) for g in knn_graph]
In [15]:
knn_graph_plot_pos = [nx.spring_layout(g, seed=42) for g in knn_graph]
In [16]:
def draw_graph(tup):
    (G, ncm, pos) = tup
    if pos is None:
        pos=nx.spring_layout(G)
    return plot.graph_from_nx(G, 
                              node_positions=pos,
                              info_df=ncm, 
                              cluster_column='Group',
                              )
    
p = plot_helpers.hv_by_legislature(draw_graph, zip(knn_graph, ncm, knn_graph_plot_pos)).relabel('Graph Representation')
display(p)
plot.group_legend(columns=3)
Out[16]:
In [17]:
# plot.save_plot(p, 'party_assignment')
# plot.save_plot(plot.group_legend(columns=3), 'party_assignment_legend')

Visually, we can already see that the parties separate very well on our graph. There are some interesting things that can be observed: First, the Green Party (PES) has become very close to the Socialist Party (PSS) in the last legislature, making the two hard to distinguish. Moreover, the Green Liberal Party (PVL) is very isolated in the second legislative period. This is probably due to the fact that this was the first period during which they were represented in parliament, and thus they tried to vote very similarly in order to raise their profile and to achieve their goals. However, they've since moved closer to the political middle, consisting of the PDC and PBD.

2D-Embedding with TSNE

A third way to represent the politicians is with TSNE, a manifold embedding into two dimensions. Whereas the dimensions themselves don't have an interpretation, nodes that are closer on the data manifold will be embedded closer together. We use this to validate the results, in order not to bias our interpretation based on the graph embedding.

In [18]:
tsne = [embed_helpers.do_tsne(x) for x in pc]
In [19]:
def plot_tsne(tup):
    (tsne, ncm) = tup
    return plot.nodes(tsne, info_df=ncm, cluster_column='Group')


plot_helpers.hv_by_legislature(plot_tsne, zip(tsne, ncm)).relabel('TSNE')
Out[19]:

We see that the observations made on the graph embedding still hold here, validating our graph.

Clustering Politicians

We perform clustering on our data in order to see whether we can recover political parties from the data. We do this with two different methods to validate our results.

Spectral Clustering

We use the graph to perform spectral clustering. However, since the graph isn't connected, we cannot use the classical spectral clustering algorithm directly. Instead, we implemented a custom version: For every connected component in the graph we perform a spectral embedding. We then look at the first $n$ eigenvalues, where $n$ is chosen such that the average cluster size is at least 20 politicians, and identify the largest gap between two consecutive eigenvalues. The position at which this eigengap is maximized determines the number of clusters that will be looked for in this connected component. The clustering itself is then done as in classical spectral clustering, on the spectral embedding of that component.
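
The eigengap rule can be sketched as follows for a single connected component. This is not the code of cluster_helpers.spectral_cluster: the use of the symmetric normalized Laplacian, the exact indexing of the eigenvalues and the function name eigengap_n_clusters are assumptions.

```python
import numpy as np

def eigengap_n_clusters(adjacency, min_cluster_size=20):
    """Number of clusters suggested by the largest eigengap (hypothetical helper).

    adjacency: symmetric weight matrix of one connected component,
    every node is assumed to have positive degree.
    """
    W = np.asarray(adjacency, dtype=float)
    n = len(W)
    # Upper bound so that the average cluster has at least min_cluster_size nodes
    max_clusters = max(1, n // min_cluster_size)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    laplacian = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(laplacian))[:max_clusters + 1]
    if len(eigvals) < 2:
        return 1, eigvals
    gaps = np.diff(eigvals)
    # A large gap after the k-th smallest eigenvalue suggests k clusters
    return int(np.argmax(gaps)) + 1, eigvals

# Two dense groups of 20 nodes that are only weakly tied to each other
block = np.ones((20, 20))
W = np.block([[block, 0.01 * block], [0.01 * block, block]])
np.fill_diagonal(W, 0.0)
k, eigvals = eigengap_n_clusters(W, min_cluster_size=10)
print(k, np.round(eigvals, 3))
```

In the toy example the two smallest eigenvalues are close to zero and the third jumps to about 1, so the rule suggests two clusters.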

In [20]:
spectral_cl = [cluster_helpers.spectral_cluster(g, subsplit_thresh=20) for g in knn_graph]

hover = plot.hover_tool({'cluster_color': 'Cluster', 'FullName': '', 'GroupNameEN': ''})
def visualize_clusterings(tup):
    if len(tup) > 3:
        (graph, coordinates, clusters, ncm) = tup
        return plot.graph_from_nx(graph, coordinates, cluster_column=clusters, info_df=ncm, hover=hover).opts(cmap='Category20')
    else:
        (coordinates, clusters, ncm) = tup
        return plot.nodes(coordinates, cluster_column=clusters, info_df=ncm, hover=hover).opts(cmap='Category20')
In [21]:
p = plot_helpers.hv_by_legislature(visualize_clusterings, zip(knn_graph, knn_graph_plot_pos, spectral_cl, ncm)).relabel("Spectral Clustering on Graph")
p
Out[21]:
In [22]:
# plot.save_plot(p, 'spectral_clustering')
In [23]:
plot_helpers.hv_by_legislature(visualize_clusterings, zip(tsne, spectral_cl, ncm)).relabel("Spectral Clustering on TSNE")
Out[23]:

We see that the clustering we obtained makes visual sense on both the graph and TSNE representation. This confirms that we don't bias the clustering with our graph structure.

K-Means Clustering

To validate our results, we use k-means clustering, where we find the optimal number of clusters using the knee-plot. Note that since it is only for validation, we refrain from using the more rigorous silhouette plot for determining the number of clusters.

In [24]:
plot_helpers.by_legislature(cluster_helpers.inertia_plot, pc)

We see that the knee is at approximately 4, 5 and 3 clusters for the three legislatures, respectively.

In [25]:
kmeans_cl = [cluster_helpers.get_kmeans_clusters(x,k) for x,k in zip(pc, [4,5,3])]
In [26]:
plot_helpers.hv_by_legislature(visualize_clusterings, zip(knn_graph, knn_graph_plot_pos, kmeans_cl, ncm)).relabel("K Means Clustering on Graph")
Out[26]:

The clustering results are only comparable to a limited extent, since the spectral clustering method yields a higher number of clusters. Nevertheless, the two clusterings seem to correspond well. One thing stands out, however: in the second legislature, two members of the Christian Democrats are clustered together with the Green Liberals, although they are not connected to them on the graph. This means that the Green Liberals aren't as isolated as one might think from the graph representation. We keep this in mind for future analysis.

Vote Topic Detection

We want to cluster votes into different topics, in order to be able to identify the position of a politician on a specific matter. To this end, we try two approaches: We perform topic analysis on the vote titles, and we cluster votes based on how the politicians voted on them.

The NLP Approach

In our NLP Approach for topic detection, we proceed as follows: We identify the nouns in every title and create a TF-IDF matrix.

In [27]:
affair_title = [nlp_helpers.get_affair_title(full_votes, l) for l in leg]
In [28]:
# Uncomment to recalculate
# title_nouns = [nlp_helpers.get_title_nouns(at) for at in affair_title] # Takes 2min
# with open('../generated/affair_title_nouns.pickle', 'wb') as file:
#     pickle.dump(title_nouns, file, protocol=pickle.HIGHEST_PROTOCOL)

# Load pre-calculated
with open('../generated/affair_title_nouns.pickle', "rb") as file:
    title_nouns = pickle.load(file)
In [29]:
word_dicts = [nlp_helpers.word_dict(x) for x in title_nouns]
noun_count_matrices = [nlp_helpers.get_count_matrix(n, d) for n, d in zip(title_nouns, word_dicts)]
noun_count_matrices_noplur = [nlp_helpers.handle_plural(c.copy()) for c in noun_count_matrices]
reduced_noun_count = [nlp_helpers.get_reduced_count(c) for c in noun_count_matrices_noplur]
noun_idf = [nlp_helpers.get_idf(c) for c in reduced_noun_count]
noun_tf_idf = [c * idf for c, idf in zip(reduced_noun_count, noun_idf)]
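
For readers unfamiliar with TF-IDF, the last two lines of the cell above can be illustrated on a toy count matrix. The sketch assumes the plain idf variant log(N / df); nlp_helpers.get_idf may use a smoothed or otherwise different formula, and the function name idf_weights is hypothetical.

```python
import numpy as np

def idf_weights(counts):
    """Inverse document frequency per noun (hypothetical helper).

    counts: 2-D array, rows = vote titles, columns = nouns.
    """
    counts = np.asarray(counts, dtype=float)
    n_docs = counts.shape[0]
    doc_freq = np.count_nonzero(counts, axis=0)   # titles containing each noun
    # Assumption: unsmoothed idf; guard against nouns that never occur
    return np.log(n_docs / np.maximum(doc_freq, 1))

counts = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [1, 0, 0],
                   [1, 0, 2]])
idf = idf_weights(counts)
tf_idf = counts * idf   # same broadcasting as `c * idf` in the cell above
print(idf)
print(tf_idf)
```

A noun that appears in every title (first column) gets weight zero, while a rare noun (second column) is weighted most heavily.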

We perform a matrix factorization of the tf-idf matrix using PCA.

In [30]:
tfidf_pc, _, tfidf_pc_fit = zip(*[embed_helpers.do_pca(d, ev) for d, ev in zip(noun_tf_idf, [.8, .8, .9])])
In [31]:
plot_helpers.by_legislature(explained_variance_plot, tfidf_pc_fit)

When we look at the explained variance plot, we see that a lot of principal components are needed in order to explain a decent amount of the variance within the tf-idf matrix. This means that our matrix is too sparse, and that the embedding of the words in topic space will not yield conclusive results.

The Data-Supported Approach

In this approach, we remove all votes that are unimportant from the analysis, as determined by our vote-importance score calculated above. Then we perform PCA on the votes and cluster the votes with KMeans clustering. Finally, we look at the vote titles within every cluster in order to assign a topic to every cluster.

In [32]:
def filter_votes(l, imp, q=.5):
    thresh = np.quantile(imp, q)
    return l.loc[imp>thresh,:]

only_imp = [filter_votes(l, imp) for l, imp in zip(leg, vote_importance)]
In [33]:
vote_pc, _, vote_pc_fit = zip(*[embed_helpers.do_pca(d, ev) for d, ev in zip(only_imp, [.8, .8, .9])])
In [34]:
plot_helpers.by_legislature(explained_variance_plot, vote_pc_fit)

Looking at the explained variance plot, we identify the number of principal components to retain as where the plot starts flattening. For the first we take 75% of the variance, for the second 82% and for the third 92%.

Again, we determine the number of clusters with the knee-plot.

In [35]:
plot_helpers.by_legislature(cluster_helpers.inertia_plot, vote_pc)

The plots are inconclusive for Legislature 1 and 2, but we take 7 for both cases. For the last case, we take 4 clusters.

In [36]:
vote_kmeans_cl = [cluster_helpers.get_kmeans_clusters(x,k) for x,k in zip(vote_pc, [7,7,4])]
In [37]:
vote_tsne = [embed_helpers.do_tsne(d) for d in vote_pc]
In [38]:
def plot_vote_tsne(tup):
    (vote_tsne, cl) = tup
    plt.scatter(vote_tsne[:,0], vote_tsne[:,1], c=cl, cmap='jet')

plot_helpers.by_legislature(plot_vote_tsne,zip(vote_tsne, vote_kmeans_cl))

When looking at the clustering visualized on a TSNE plot, we see that the clustering doesn't seem very meaningful. The vote data seems very fragmented, and the cluster assignment doesn't do a good job of recovering these fragments. We therefore conclude that we cannot recover topics by clustering the important votes. Note that changing the number of votes taken into account for this analysis doesn't change this conclusion.

Vote Topic Detection: A Failure?

With the two approaches we tried, it was unfortunately not possible to obtain the vote topics from the data. The NLP approach failed because we didn't have enough text per vote. In theory, for every vote there is a full transcript of the parliamentary discussion available. However, since the API of the Swiss government isn't implemented yet and web-crawling was too slow, we couldn't include these transcripts in our analysis. Such a corpus might be sufficient for robust topic detection.

Another thing that could be tried would be to combine our two approaches. The combined information of text and voting data may be sufficient to recover topics. To implement this one could for instance create a graph of title similarity, and a graph of vote similarity between two votes. Those could be combined into a two-layered graph, on which spectral clustering can be performed.

3. The Swiss Executive Power

In Switzerland, the executive branch of government consists of a council of 7 people, who are chosen to roughly represent the political landscape of Switzerland. They are elected by the Federal Assembly (the National Council together with the Council of States). We are interested in finding out whether their selection can be justified by our data. There is no obvious reason why one candidate is elected over another, although media coverage suggests that a candidate's party plays a major role. Yet we have seen that belonging to a party does not necessarily guarantee a particular voting pattern or active support for other party members. Does the assumption that party membership matters hold up in the light of our analysis? If party membership were a key element, the elected member would need to be representative of their party, which we simplify as being “central” in the cluster of the party. To test this assumption, we look at the votes of members of the National Council in the legislature preceding their election to the Federal Council, to see whether they could be considered “central” at the time of the election.

3.1 Finding the Federal Councillors

The first step is to find the national council members who went on to become federal councillors and whose voting behavior is in our database.

In [39]:
#List of Federal Councilors active in the second or third legislature 
cf_name = ["Cassis Ignazio","Berset Alain","Keller-Sutter Karin","Amherd Viola","Parmelin Guy","Sommaruga Simonetta","Maurer Ueli","Burkhalter Didier","Leuthard Doris",'Schneider-Ammann Johann N.']
# See how many politicians we have by legislature
for i,legislature in enumerate(ncm):
    cf = legislature.loc[legislature["FullName"].isin(cf_name)]
    print("\nLegislature {}:\n\n We have {} politicians :\n\n {} \n".format(i+1,len(cf),cf["FullName"].values))
Legislature 1:

 We have 4 politicians :

 ['Schneider-Ammann Johann N.' 'Parmelin Guy' 'Amherd Viola'
 'Cassis Ignazio'] 


Legislature 2:

 We have 3 politicians :

 ['Parmelin Guy' 'Amherd Viola' 'Cassis Ignazio'] 


Legislature 3:

 We have 2 politicians :

 ['Amherd Viola' 'Cassis Ignazio'] 

We see that our analysis will be limited by the fact that the voting records of only 4 federal councillors are in the database. Therefore, the conclusions drawn from our analysis should be taken with a grain of salt. We choose here to only look at information from the first legislature, as it is the only one in which all four councillors are present.

In [40]:
# Get all info for the federal councilors
federal_counsilor_info = ncm[0].loc[ncm[0]["FullName"].isin(cf_name)]

3.2 Visualizing the Executive Power

Now let's see where those politicians are in the representation of our data we made above.

3.2.1 TSNE Representation

In [41]:
federal_counsilor_pos_tsne = np.asarray([tsne[0][i] for i in federal_counsilor_info.index])
In [42]:
all_points = plot.nodes(tsne[0], info_df=ncm[0], cluster_column='Group').opts(alpha=1)
fc_points = plot.nodes(federal_counsilor_pos_tsne, federal_counsilor_info)
# Named `legend` so that the list `leg` of per-legislature vote data is not overwritten
legend = plot.group_legend(['Federal Councillors'],["pink"], columns=1)
real_tsne_07_11 = (all_points*fc_points.options(color='pink') + legend).relabel("TSNE of Councillors Voting Trends in '07-'11")
real_tsne_07_11
Out[42]:
In [43]:
# plot.save_plot(real_tsne_07_11, "real_tsne_07_11");

We see that the councillors are not necessarily central in the cluster of their party. This could indicate that having a voting pattern representative of one's party is not relevant for being elected to the Federal Council. We can check whether this result holds in the graph representation.

3.2.2 Graph Representation

In [44]:
federal_counsilor_pos_graph = np.asarray([knn_graph_plot_pos[0][i] for i in federal_counsilor_info.index])
overall_graph = plot.graph_from_nx(knn_graph[0], node_positions=knn_graph_plot_pos[0] ,info_df=ncm[0],cluster_column='SimplePartyAbbreviation')
federal_councillor_nodes = plot.nodes(federal_counsilor_pos_graph, federal_counsilor_info)
# Named `legend` so that the list `leg` of per-legislature vote data is not overwritten
legend = plot.group_legend(['Federal Councillors'],["pink"])
real_graph = (overall_graph*federal_councillor_nodes.options(color='pink') + legend).relabel("Councillors Voting Trends in '07-'11 Legislative")
real_graph
Out[44]:
In [45]:
# plot.save_plot(real_graph,"real_graph");

Again the points do not seem so central. Acknowledging that the visual approach is limited, we look for a more precise measure of centrality. As the graph is not connected, we decide to use harmonic centrality as a metric.

3.2.3 Node Centrality

Intuitively, it would seem that the more "central" a point is, the more people agree with it. We can hypothesize that a good member of the executive power is someone who agrees with as many members of the legislative power as possible. This makes sense if we want the lawmakers to be happy with the way the law is applied. To measure centrality, since the network is disconnected, we look at the harmonic centrality. This is defined as:

Harmonic centrality [1] of a node u is the sum of the reciprocal of the shortest path distances from all other nodes to u

[1] Boldi, Paolo, and Sebastiano Vigna. “Axioms for centrality.” Internet Mathematics 10.3-4 (2014): 222-262.
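
The definition can be written out in a few lines of plain Python. The sketch below counts path lengths in hops on an unweighted, undirected graph given as an adjacency dictionary, and nodes that cannot be reached simply contribute nothing to the sum, which is why the measure also works on a disconnected graph. Whether the notebook's call uses hop counts or weighted distances is not visible here, so treat the unweighted version as an assumption.

```python
from collections import deque

def harmonic_centrality(adj):
    """Harmonic centrality for an undirected, unweighted graph.

    adj: dict mapping each node to an iterable of its neighbours.
    """
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest path lengths in hops
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        # Unreachable nodes are absent from `dist` and add nothing
        scores[source] = sum(1.0 / d for n, d in dist.items() if n != source)
    return scores

# Two components: a path a-b-c and a single edge d-e
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
scores = harmonic_centrality(adj)
print(scores)
```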

In [46]:
plot_helpers.plot_closeness([knn_graph[0]],[ncm[0]],[federal_counsilor_info],["Node Harmonic centrality per Councillor"],nx.harmonic_centrality);

Although never among the highest values, some of the councillors have a fairly high harmonic centrality. This would fit the hypothesis that members of the executive power need to "agree" with a majority. Yet Guy Parmelin ranks very low, which challenges our hypothesis. This is most likely due to the isolation of the UDC cluster. It also reminds us that it is unrealistic for a politician to satisfy everyone. The more interesting question is whether the elected councillors agree with the majority of their own party. We can also hypothesize that a central node is representative of its cluster, so that a central politician is a representative member of their party. To find such representatives, we calculate the centrality on a graph made up only of politicians of the same party. Since these graphs are connected, we can use Katz centrality as the centrality measure, which is more robust. It is defined as:

Katz centrality computes the relative influence of a node within a network by measuring the number of the immediate neighbors (first degree nodes) and also all other nodes in the network that connect to the node under consideration through these immediate neighbors. [2]

[2] Leo Katz: A New Status Index Derived from Sociometric Index. Psychometrika 18(1):39–43, 1953 http://phya.snu.ac.kr/~dkim/PRL87278701.pdf
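
In matrix form, Katz centrality is the solution of $x = \alpha A^T x + \beta \mathbf{1}$, where $A$ is the adjacency matrix and $\alpha$ must be smaller than the reciprocal of the largest eigenvalue of $A$. The sketch below solves this linear system directly with NumPy. The automatic choice of $\alpha$ and the normalization to unit length are assumptions for illustration; the notebook itself relies on nx.katz_centrality.

```python
import numpy as np

def katz_centrality(adjacency, alpha=None, beta=1.0):
    """Katz centrality via a direct linear solve (illustrative sketch)."""
    A = np.asarray(adjacency, dtype=float)
    n = len(A)
    if alpha is None:
        # Assumption: stay safely below 1 / (largest eigenvalue magnitude)
        lam_max = np.max(np.abs(np.linalg.eigvals(A)))
        alpha = 0.5 / lam_max
    # Solve (I - alpha * A^T) x = beta * 1
    x = np.linalg.solve(np.eye(n) - alpha * A.T, beta * np.ones(n))
    return x / np.linalg.norm(x)

# Star graph: node 0 is connected to four leaves
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0
katz = katz_centrality(star)
print(np.round(katz, 3))
```

As expected, the hub of the star receives the highest score and all leaves receive the same, lower score.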

In [47]:
federal_counsilor_info
Out[47]:
Legislative FullName CouncillorId PartyAbbreviation SimplePartyAbbreviation SimplePartyId Active BirthPlace_Canton BirthPlace_City CantonAbbreviation ... Image LastName Mandates MaritalStatusText ParlGroupAbbreviation ParlGroupName PartyName Group GroupId GroupNameEN
61 0.0 Schneider-Ammann Johann N. 508 PLR PLR 3.0 False Berne Sumiswald BE ... https://www.parlament.ch/sitecollectionimages/... Schneider-Ammann NaN marié(e) RL Groupe radical-libéral Parti radical-démocratique suisse RL 3 Liberals
82 0.0 Parmelin Guy 1108 UDC UDC 1.0 False Vaud Bursins VD ... https://www.parlament.ch/sitecollectionimages/... Parmelin NaN NaN V Groupe de l'Union démocratique du Centre Union Démocratique du Centre V 0 Swiss People's Party
124 0.0 Amherd Viola 1288 PDC PDC 4.0 False Valais Brigue VS ... https://www.parlament.ch/sitecollectionimages/... Amherd Exekutive der Gemeinde (Stadträtin) Brig-Glis:... NaN CEg Groupe PDC/PEV/PVL Parti démocrate-chrétien suisse CE 2 Christian Democrats
135 0.0 Cassis Ignazio 3828 PLR PLR 3.0 False Tessin Sessa TI ... https://www.parlament.ch/sitecollectionimages/... Cassis Legislativo del comune Collina d'Oro: da april... NaN RL Groupe radical-libéral Parti radical-démocratique suisse RL 3 Liberals

4 rows × 29 columns

In [48]:
parties = ['UDC', 'PLR', 'PDC']
show_yaxis = [True, True, False]
centrality_rankings = [plot_helpers.party_centrality_ranking(party, ncm[0], pc[0], federal_counsilor_info) for party in parties]
centrality_plots = [plot.closeness(party, *ranking, nx.katz_centrality, yaxis) for party, ranking, yaxis in zip(parties, centrality_rankings, show_yaxis)]
centrality_0 = centrality_plots[0]
centrality_1 = (centrality_plots[1] + centrality_plots[2]).opts(toolbar='right')
display(centrality_0);
display(centrality_1);

None of the elected councillors seems to have a voting pattern that is very representative of their political party. Many other candidates would have been a better choice if the selection criterion were having political stances representative of one's party. This is at odds with the way the media depict the election. The interpretability of these results is of course very limited by the scarcity of the data.

In [49]:
# plot.save_plot(centrality_0, 'centrality_plot_0')
# plot.save_plot(centrality_1, 'cnetrality_plot_1')

3.3 Picking a "Better" Executive Power

The generally accepted idea that a politician is elected to the Federal Council mostly based on their political party is challenged by our previous results. Yet this is a keystone argument in all debates surrounding the elections. This is even truer now, as the recent election for the Federal Council has sparked a new controversy. The new council is criticized for not being representative enough of the current Swiss political landscape. The major issue is the continued absence of councillors from the Green Party (PES), which has grown significantly in the last decade.

We make the claim that we can make a pick that is not based on political party, but simply on the voting pattern of the National Councilors. Of course the executive power is not only chosen from the National Council, so our solution isn't optimal.

To make our pick we make the following assumptions:

  • Voting patterns are a good indication of the political orientation of a councilor
  • The optimal pick for the Federal Council is 7 councilors whose political orientation is best representing 7 equal sized sub-groups of councilors
  • A group is best represented by a point at its center.

We first do this based only on the data of the first legislature, to have a point of comparison with our previous findings.

3.3.1 Determining sub-groups

To obtain 7 sub-groups of equal size of the national council members, we implemented a Kmeans variation that forced the clusters to have a similar size.

The algorithm follows this principle:

  • Set the initial cluster centers to the centers obtained from a classical KMeans
  • Order the points in decreasing order of (distance to nearest cluster - distance to farthest cluster)
  • Put every cluster in a list of available clusters
  • Starting with the first point in the ordering, assign each point to the closest cluster that is still available
  • When a cluster is full, remove it from the list of available clusters
  • When all clusters are full, add the remaining points to their closest clusters.

We applied this method to each connected sub_graph in the graph.
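
The assignment step of this procedure can be sketched as follows. This is not the code of same_size_kmeans: the centers are passed in instead of being computed by a classical KMeans, a cluster counts as full once it holds n // k points, and the ordering is interpreted as "largest benefit of the best over the worst assignment first". The function name same_size_assign and all three choices are assumptions.

```python
import numpy as np

def same_size_assign(points, centers):
    """Greedy balanced assignment of points to given centers (illustrative sketch)."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    n, k = len(points), len(centers)
    capacity = n // k                          # assumption: size at which a cluster is full
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    # Points that lose most from a bad assignment are handled first
    benefit = dist.max(axis=1) - dist.min(axis=1)
    order = np.argsort(-benefit)
    labels = np.full(n, -1)
    sizes = np.zeros(k, dtype=int)
    available = set(range(k))
    for i in order:
        if available:
            best = min(available, key=lambda c: dist[i, c])
            sizes[best] += 1
            if sizes[best] >= capacity:
                available.remove(best)
        else:
            # All clusters are full: fall back to the closest one
            best = int(np.argmin(dist[i]))
        labels[i] = best
    return labels

# 14 points around (0, 0) and 6 around (5, 5), forced into two groups of 10
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 1.0, size=(14, 2)),
                 rng.normal(5.0, 1.0, size=(6, 2))])
labels = same_size_assign(pts, centers=[[0.0, 0.0], [5.0, 5.0]])
print(np.bincount(labels, minlength=2))
```

A plain KMeans would split these points roughly 14 to 6, whereas the balanced assignment moves the least clear-cut points of the large group over to the small one.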

In [50]:
clusters = same_size_kmeans.per_subgraph(knn_graph[0],[2,2,3],pc[0],random_state = 22)
# Plotting the 7 found clusters
graph = plot.graph_from_nx(knn_graph[0], node_positions=knn_graph_plot_pos[0] ).opts(node_alpha=0.7)
colors = ["purple","blue","green","yellow","cyan","grey","red"]
cluster_graphs = [plot_helpers.plot_clusters(i,color,graph,clusters,ncm[0],knn_graph_plot_pos[0]) for i,color in zip(range(7),colors)]

graph * reduce((lambda x, y: x * y), cluster_graphs)
Out[50]:

The clustering seems to be working. We can plot the histogram to verify that the clusters are indeed of similar size.

In [51]:
plt.hist(pd.DataFrame(clusters).rename(columns={0:"Clusters Size"}).to_numpy()+1, 
         bins=np.linspace(.5,7.5,8))
plt.xlabel('Cluster')
plt.ylabel('Size')
plt.show()

Even though there are some small variations in the size of the clusters, we consider them small enough not to be problematic.

3.3.2 Finding the best representation of each subgroup

Now that we have established a satisfactory partition of the council members, we need to find the best representation of this partition. Using the assumption that a group is best represented by someone central in it, and measuring centrality using Katz centrality, we find the most representative members of each cluster.

In [52]:
# Let's see who our new pick would be 
max_nodes= [cluster_helpers.get_best_node(i,clusters,pc[0]) for i in range(7)]
In [53]:
# visualizing our pick
our_councillor = ncm[0].iloc[max_nodes]
our_councillor[["FullName","PartyAbbreviation"]]
Out[53]:
FullName PartyAbbreviation
92 Rime Jean-François UDC
55 Pfister Theophil UDC
28 Thanei Anita PSS
126 John-Calame Francine PES
148 Eichenberger-Walther Corina PLR
96 de Buman Dominique PDC
19 Hochreutener Norbert PDC

Several observations can already be made:

  • There is no overlap with the actual list of federal councillors
  • Both Jean-François Rime and Dominique de Buman were candidates for the Federal Council and were not elected
  • There is a member of the Green Party (PES), which has historically never had a seat on the Swiss Federal Council
  • One seat of the Liberal Party (PLR), which currently holds two seats, is given to the Christian Democrats.
In [54]:
# Adding our pick of federal councillor
our_councillor_pos_graph = np.asarray([knn_graph_plot_pos[0][i] for i in our_councillor.index])
our_councillor_plot = plot.nodes(our_councillor_pos_graph,our_councillor)
new_pick = (graph * reduce((lambda x, y: x * y), cluster_graphs) * our_councillor_plot.options(color="pink")).relabel("Our Pick for the '11-'15 Legislative")
new_pick
Out[54]:
In [55]:
#plot.save_plot(new_pick,"new_pick");

We see that they are indeed very central in their clusters. Based on the assumption we made, they would be a good pick for the Federal Council.

3.3.3 Solving the 2019 Federal Council polemic

As mentioned before, the election for the Federal Council on the 11th of December 2019 has raised concerns about the representativeness of the Federal Council. Using the methodology created above, we came up with a new set of councillors that might be more representative.

In [56]:
# Create the sub_groups for the last Legislature
clusters = same_size_kmeans.per_subgraph(knn_graph[2],[5,2],pc[2],random_state = 7)
# Plotting the 7 found clusters
graph = plot.graph_from_nx(knn_graph[2], node_positions=knn_graph_plot_pos[2] ).opts(node_alpha=0.7)
colors = ["purple","blue","green","yellow","cyan","grey","red"]
cluster_graphs = [plot_helpers.plot_clusters(i,color,graph,clusters,ncm[2],knn_graph_plot_pos[2]) for i,color in zip(range(7),colors)]
In [57]:
# Find the new pick based on voting trend of the last legislature
max_nodes= [cluster_helpers.get_best_node(i,clusters,pc[2]) for i in range(7)]
our_councillor = ncm[2].iloc[max_nodes]
our_councillor[["FullName","PartyAbbreviation"]]
Out[57]:
FullName PartyAbbreviation
128 Romano Marco PDC
18 Sommaruga Carlo PSS
117 Bertschy Kathrin PVL
131 Schilliger Peter PLR
123 Tornare Manuel PSS
147 Addor Jean-Luc UDC
155 Grüter Franz UDC
In [59]:
# Visualizing them 
our_councillor_pos_graph = np.asarray([knn_graph_plot_pos[2][i] for i in our_councillor.index])
our_councillor_plot = plot.nodes(our_councillor_pos_graph,our_councillor)
clusters_name = ['Cluster '+ str(i) for i in range(7)]
next_legislative = (graph * reduce((lambda x, y: x * y), cluster_graphs) * our_councillor_plot.options(color="pink")).relabel("Our Pick for the '20-'24 Legislative")
next_legislative
Out[59]:
In [60]:
#plot.save_plot(next_legislative,"next_legislative_pick");

There is again no overlap between our pick and the current Federal Councillors. Unlike in the pick from the first legislature, none of our councillors has ever been a candidate in a Federal Council election. This indicates that our method is quite different from what is currently practiced. We also see that no member of the Green Party is part of our pick, even though a member of the Green Liberals (PVL), who have historically never had a federal councillor, did make it into the list.

This choice is limited by the fact that it is only representative of the National Council and not the Council of States. The latter also has no members elected, which is also a limitation. Although there is no way of knowing whether this pick would make a good executive power, it would at least be a bold new choice.

The pick could also be improved in terms of its egalitarian value. This concerns gender, but also the balance between Swiss German, Swiss Romand and Ticinese members. We could enforce this balance, but this would require coming up with a set of rules that the composition of the Federal Council should abide by, which goes beyond the scope of our analysis.